FiDoop: Parallel Mining of Frequent Itemsets Using MapReduce
Authors
Abstract
Similar Resources
Fast Parallel Mining of Frequent Itemsets
Association rule mining has become an essential data mining technique in various fields and the massive growth of the available data demands more and more computational power. To address this issue, it is necessary to study parallel implementations of such algorithms. In this paper, we propose a parallel approach to the Frequent Pattern Tree (FP-Tree) algorithm, which is a fast and popular tree...
Parallel Mining of Frequent Maximal Itemsets Using Order Preserving Generators
In this paper, we propose a parallel algorithm for mining maximal itemsets. We propose POP-MAX (Parallel Order Preserving MAXimal itemset algorithm), a fast and memory efficient parallel algorithm which enumerates all the maximal patterns concurrently and independently across several nodes. Also, POP-MAX uses an efficient maximality checking technique which determines the maximality of an items...
Maximal Frequent Itemsets Mining Using Database Encoding
Frequent itemset mining is a classic problem in data mining and has played an important role in data mining research for over a decade. However, mining all frequent itemsets produces a massive number of itemsets. Fortunately, this problem can be reduced to the mining of maximal frequent itemsets. In this paper, we propose a new method for mining maximal frequent itemsets. Our method ...
Mining Frequent Itemsets Using Support Constraints
Interesting patterns often occur at varied levels of support. The classic association mining based on a uniform minimum support, such as Apriori, either misses interesting patterns of low support or suffers from the bottleneck of itemset generation. A better solution is to exploit support constraints, which specify what minimum support is required for what itemsets, so that only necessary itemse...
Mining Frequent Itemsets using Patricia Tries
We present a depth-first algorithm, PatriciaMine, that discovers all frequent itemsets in a dataset, for a given support threshold. The algorithm is main-memory based and employs a Patricia trie to represent the dataset, which is space efficient for both dense and sparse datasets, whereas alternative representations were adopted by previous algorithms for these two cases. A number of optimizati...
Journal
Journal title: IEEE Transactions on Systems, Man, and Cybernetics: Systems
Year: 2016
ISSN: 2168-2216, 2168-2232
DOI: 10.1109/tsmc.2015.2437327